Flat Minima
Abstract
We present a new algorithm for finding low-complexity neural networks with high generalization capability. The algorithm searches for a "flat" minimum of the error function. A flat minimum is a large connected region in weight space where the error remains approximately constant. An MDL-based, Bayesian argument suggests that flat minima correspond to "simple" networks and low expected overfitting. The argument is based on a Gibbs algorithm variant and a novel way of splitting generalization error into underfitting and overfitting error. Unlike many previous approaches, ours does not require Gaussian assumptions and does not depend on a "good" weight prior. Instead, we have a prior over input-output functions, thus taking into account net architecture and training set. Although our algorithm requires the computation of second-order derivatives, it has backpropagation's order of complexity. It automatically and effectively prunes units, weights, and input lines. Various experiments with feedforward and recurrent nets are described. In an application to stock market prediction, flat minimum search outperforms conventional backprop, weight decay, and "optimal brain surgeon/optimal brain damage".
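The definition of a flat minimum given in the abstract (a region around the minimum where the error stays approximately constant) can be illustrated with a one-dimensional toy. The sketch below is not the paper's flat minimum search algorithm; the function `flat_radius`, the tolerance `tol`, and the two quadratic losses are hypothetical names and values chosen only for illustration. It measures how far a single weight can move away from a minimum before the loss rises by more than `tol`.

```python
def flat_radius(loss, w_min, tol=0.01, step=0.001, max_radius=10.0):
    """Largest radius r (a multiple of `step`) such that the loss stays
    within `tol` of loss(w_min) at every grid point in [w_min - r, w_min + r].

    Illustrative only: a 1-D grid scan, not the method from the paper.
    """
    base = loss(w_min)
    k = 0
    while (k + 1) * step <= max_radius:
        d = (k + 1) * step
        # Stop at the first grid point, on either side, that exceeds the tolerance.
        if loss(w_min + d) - base > tol or loss(w_min - d) - base > tol:
            break
        k += 1
    return k * step


# Two toy losses (assumed for illustration) with the same minimum value at w = 0.
def sharp(w):
    return 50.0 * w ** 2  # high curvature: a narrow, "sharp" minimum


def flat(w):
    return 0.5 * w ** 2   # low curvature: a wide, "flat" minimum


r_sharp = flat_radius(sharp, 0.0)
r_flat = flat_radius(flat, 0.0)
print(r_sharp, r_flat)
```

Both minima have the same error, but the weight at the flat one can be specified far less precisely without changing the error much, which is the intuition behind the MDL-based argument in the abstract.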
Related resources
Simplifying Neural Nets by Discovering Flat Minima
We present a new algorithm for finding low complexity networks with high generalization capability. The algorithm searches for large connected regions of so-called "flat" minima of the error function. In the weight-space environment of a "flat" minimum, the error remains approximately constant. Using an MDL-based argument, flat minima can be shown to correspond to low expected overfitting. Al...
Dynamical Supersymmetry Breaking versus Run-away Behavior in Supersymmetric Gauge Theories
We consider Dynamical Supersymmetry Breaking (DSB) in models with classical flat directions. We analyze a number of examples, and develop a systematic approach to determine if classical flat directions are stabilized in the full quantum theory, or lead to run-away behavior. In some cases pseudo-flat directions remain even at the quantum level before taking into account corrections to the Kähler...
Anomalous Light Variations in the Eclipsing Binary RZ Cassiopeiae
UBV light curves together with color curves of the semi-detached eclipsing binary RZ Cas are presented. The light curves are analyzed and the spectroscopic elements and the Hipparcos information are used to compute the absolute parameters of the system. The light curve anomalies and occasional flat minima are discussed. Based on the existing evidence, a straightforward explanation for the prim...
Affleck-Dine baryogenesis in anomaly-mediated SUSY breaking
It has been known that in anomaly-mediated SUSY breaking model Affleck-Dine baryogenesis does not work due to trapping of Affleck-Dine field into charge-breaking minima. We show that when finite-temperature effect is properly taken into account and if reheating temperature is relatively high, the problem of falling into charge breaking global minima can be avoided and hence Affleck-Dine baryoge...
Late Quaternary geomorphology and soils in Crater Flat, Yucca Mountain area, southern Nevada
Soil-geomorphic studies indicate that six major allostratigraphic units occur in Crater Flat, Nevada, adjacent to Yucca Mountain. These units are, from youngest to oldest, Crater Flat, Little Cones, Late Black Cone, Early Black Cone, Yucca, and Solitario. Presence and degree of differentiation of Av, Ak, Bw, Bt, Btk, Btkq, and Bqkm genetic soil horizons characterize units, confirm relative ages...
Journal:
- Neural Computation
Volume 9, Issue 1
Pages: -
Publication date: 1997